Paper presented at the Joint Conference of the Australian Association for Research in Education and the New Zealand
Association for Research in Education, Melbourne, December, 1999.



Abstract 

Cooperatively structured laboratory sessions were used to introduce undergraduate students to basic research methods and 
statistics. In a previous evaluation of the effects of this programme using self-report measures, mathematical self-concept was 
enhanced, but mathematics anxiety did not change. This study replicated the programme but included the use of retrospective 
pre-tests in order to examine a possible contamination of the previous results through a response-shift bias. Positive effects 
were found for both mathematics self-concept and anxiety, and there was no evidence of bias resulting from a change in the 
internal standards used to complete the self-report measures. 

MATHEMATICS ANXIETY AND SELF-CONCEPT: 

EVALUATING CHANGE USING THE "THEN-NOW" PROCEDURE



Attitudes toward mathematics appear more polarised than for any other curriculum area. While many students enjoy 
mathematics, many others have negative attitudes (Ashcraft, Kirk, & Hopko, 1998; Fennema & Sherman, 1978; Jacobs, Watson, 
& Sutton, 1996; Stodolsky, 1985). The problem of negative attitudes is compounded by such attitudes being resistant to change 
(Sullivan, 1989; Tobias, 1995). Although it may be unsurprising that anxiety about mathematics increases across the school 
years (Brush, 1981), it does seem surprising that strong negative feelings are found among graduate students at university 
(Quilter & Harper, 1988), even among those enrolled in higher mathematics, statistics and research methods courses (Kazelskis, 
1998; Onwuegbuzie & Daley, 1999). Perhaps most ominous is the finding that these negative attitudes affect a high proportion 
of students entering the teaching profession (Kelly & Tomhave, 1985; Nisbett, 1990, 1991; Trujillo & Hadfield, 1999; Watson, 
1987). The endemic nature of this problem has led Tobias (1995) to call for greater political awareness of the social and 
pedagogical origins of this widespread mathematical disability. 

Much of the research on mathematical disability has focused on two dimensions of attitude: Mathematics self-concept refers to 
perceptions of personal ability to learn and perform tasks in mathematics (Reyes, 1984), while mathematics anxiety refers to 
feelings of tension that interfere with the manipulation of numbers and the solving of mathematical problems in a wide variety of 
ordinary and academic situations (Tobias, 1995). Because of their relationship to mathematics achievement (see Donlan, 1998), 
both dimensions have been examined in primary, secondary and university level students. With reference to the university level, 
a number of studies have examined one or both of the dimensions in general populations of students (e.g., Pajares & Urdan,
1996; Zettle & Houghton, 1998), preservice teachers (e.g., Trujillo & Hadfield, 1999), and both undergraduate and graduate
students taking courses in mathematics, statistics, or research methods (e.g., Bessant, 1995; Birenbaum & Shoshana, 1994; 
Bradley & Wygant, 1998; Kazelskis, 1998; Onwuegbuzie & Daley, 1999; Onwuegbuzie, 1999). However, relatively few of these 
studies have been concerned with interventions designed to influence mathematics self-concept and anxiety. 



In one study which did attempt to influence self-concept and anxiety, research methods and statistics were taught using 
cooperative learning activities over a full academic year to undergraduate students in an educational psychology laboratory 
course (Townsend, Moore, Tuck, & Wilton, 1998). Based on 153 students, post-test scores for mathematics self-concept were higher
than the pre-test scores, but mathematics anxiety did not decrease on the self-report measure. However, in open-ended 
comments it was apparent that students had gained a great deal of confidence in using statistics, a change that should have 
been associated with decreased mathematics anxiety. It was argued that there may have been insufficient time in the course to 
influence negative emotions built up over many years, or that the instrument itself was not sensitive enough to detect the 
changes in anxiety. However, the main hypothesis advanced for the failure to see a change in anxiety was that there was a 
confluence of test administration and course assessment. At the time the anxiety scale was administered at post-test,
students had still to complete an independent calculation and interpretation of a two-way analysis of variance involving an
interaction. It was possible that elevated levels of anxiety associated with the pending assessment influenced the score on the
anxiety scale. However, a further hypothesis, not advanced by Townsend et al., was that a response-shift bias may have been
operating.

A response-shift bias (Howard et al., 1979; Koele & Hoogstraaten, 1988) exists when a change occurs between pre-test and
post-test in the internal standard used to provide a self-report. Self-report instruments (such as an anxiety questionnaire) are 
often used in typical pre-test-post-test evaluations of an intervention, and the usual methods of analysis assume that a 
person’s internal standard for using the scale is consistent at both times of testing (Cronbach & Furby, 1970). If the person 
applies the standard consistently, and there is a change in the score on the dimension being measured, then a "real" change in
behaviour has occurred. However, it is possible that the intervention itself may cause a change in the internal standard used 
between "then" (pre-test) and "now" (post-test), resulting in a confounding of any actual changes in the person’s behaviour with 
the distortion of the internalised scale. This invalidates any interpretation of the effectiveness of the intervention (Campbell &
Stanley, 1963). In most evaluations of training programmes using a pre-test-post-test design, it is likely that the beneficial
effects of the training will be underestimated (Howard et al., 1979).

As a solution to the problem of response-shift bias, Howard et al. (1979) proposed that retrospective pre-test ratings be
obtained in addition to the usual pre-test and post-test ratings. In studies involving retrospective ratings, the person is asked to
complete both a post-test and a retrospective pre-test (of how they felt "then"). Any difference between the scores on the actual
pre-test and the "then-test" would reflect the magnitude of the response-shift bias. If response-shift bias is present, then the 
effectiveness of the intervention is indicated by the difference between the post-test and the then-test scores. If a 
response-shift bias is not present then differences between the pre-test and the post-test should be used as the estimate of 
treatment effects. Studies using this methodology have examined the effectiveness of interventions in a number of areas, 
including teacher training (Bray & Howard, 1980), job satisfaction (Gutek & Winter, 1992), interviewing skill (Howard & Dailey, 
1979), and counsellor education (Manthei, 1997). Many studies have found the post-test plus retrospective pre-test scores to 
offer a better index of change than the conventional pre-test plus post-test scores (Sprangers & Hoogstraaten, 1989). 
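
The decision rule described above can be sketched in a few lines of Python. This is only an illustration: the class and function names are invented here, the scores are hypothetical rather than data from any study, and in practice the presence of a response shift would be judged by a statistical comparison of pre-test and then-test scores, not by a flag set by hand.

```python
from dataclasses import dataclass


@dataclass
class Ratings:
    pre: float    # rating given at the start of the course
    post: float   # rating given at the end of the course
    then: float   # retrospective rating of the start, given at the end


def response_shift(r: Ratings) -> float:
    """Then-test minus actual pre-test: the apparent shift in internal standard."""
    return r.then - r.pre


def treatment_effect(r: Ratings, shift_present: bool) -> float:
    """Post minus then-test if a shift is present, otherwise post minus pre-test."""
    baseline = r.then if shift_present else r.pre
    return r.post - baseline


# Hypothetical anxiety scores for one student (not data from the study).
student = Ratings(pre=30, post=30, then=35)
print(response_shift(student))                         # 5
print(treatment_effect(student, shift_present=False))  # 0
print(treatment_effect(student, shift_present=True))   # -5
```

In this hypothetical case the conventional pre-test-post-test comparison shows no change, while the then-test comparison shows a five-point drop in anxiety, which is the kind of underestimation that a response-shift bias is thought to produce.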

To return to the Townsend et al. (1998) study, it is possible that the students developed a better understanding of their
anxiety as a result of taking the course. This may have resulted, for example, in their actual pre-test scores (found in the study
to be almost identical to the post-test scores) being an underestimate of their anxiety. The current study was designed to
replicate the procedures of the Townsend et al. study, with a similar group of students, using the addition of the retrospective
"then-test". The current study had two additional features not present in the original study. First, the University had changed 
from a full-year system to a semester system for the delivery of courses - the course now met more often, but was completed 
in 12 teaching weeks rather than 25. Second, the post-test measures were administered a week after the final statistical 
assignment was completed. This last procedure was included to test the "confluence" hypothesis described earlier, while the 
then-test was included to examine the possible presence of response-shift bias.

METHOD 



Participants 



The participants in this study were 141 students enrolled in a second-year course in educational psychology at the University of 
Auckland. Three groups of students were formed, the Retrospective group (n = 66), the Pre-Post group (n = 27), and the 
Pre-test group (n = 48). 

Instruments 

Mathematics Self-Concept Scale. This 27-item self-report scale (Gourgey, 1982) measured attitudes, beliefs and feelings about
one’s ability to learn mathematics. Approximately half of the items were worded positively (e.g. "I have a good mind for
mathematics") and half negatively (e.g. "I have a mental block when it comes to mathematics"). Item responses used a 5-point
scale ranging from "strongly agree" to "strongly disagree" and, after reverse-scoring where appropriate, total scores could
range from 27 (low self-concept) to 135 (high self-concept). The scale has high reliability. Gourgey reported an internal
consistency reliability (coefficient alpha) of .96 with 92 graduate and undergraduate students in a statistics course. In the 
current study, an estimate of .96 (coefficient alpha) was obtained at the first administration. 
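
As an illustration of how reverse-scoring produces the 27 to 135 range, the following Python sketch sums item responses after reversing the negatively worded items. It assumes responses are coded 1 ("strongly disagree") to 5 ("strongly agree"); the positions of the negatively worded items are hypothetical, since the item order of the scale is not given here.

```python
def score_scale(responses, reverse_keyed, points=5):
    """Sum Likert responses after reverse-scoring the listed item positions.

    responses     -- list of integers from 1 to `points`, one per item
    reverse_keyed -- set of 0-based item positions to reverse (hypothetical here)
    """
    if any(not 1 <= r <= points for r in responses):
        raise ValueError("responses must lie between 1 and %d" % points)
    return sum((points + 1 - r) if i in reverse_keyed else r
               for i, r in enumerate(responses))


N_ITEMS = 27
# Assumption for illustration: every second item is negatively worded.
negative_items = set(range(1, N_ITEMS, 2))

# Least favourable answers: disagree with positive items, agree with negative ones.
lowest = [5 if i in negative_items else 1 for i in range(N_ITEMS)]
# Most favourable answers: the reverse pattern.
highest = [1 if i in negative_items else 5 for i in range(N_ITEMS)]

print(score_scale(lowest, negative_items))   # 27
print(score_scale(highest, negative_items))  # 135
```

The same function would apply to the 10-item anxiety scale, although there the direction of keying is the other way round, because a high total indicates high anxiety.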

Mathematics Anxiety Scale. This 10-item self-report scale (Betz, 1978) measured anxiety related to doing mathematics. Again,
approximately half of the items were worded positively (e.g. "I usually don’t worry about my ability to solve mathematical
problems") and half negatively (e.g. "Mathematics makes me feel uncomfortable and nervous"). Item responses used a 5-point
scale ranging from "strongly agree" to "strongly disagree" and, after reverse-scoring where appropriate, total scores could
range from 10 (low anxiety) to 50 (high anxiety). The scale has high reliability. Betz reported a split-half reliability of .92 based
on 652 undergraduate students in the United States. In the current study, an estimate of .94 (coefficient alpha) was obtained at 
the first administration. 
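
For readers unfamiliar with coefficient alpha, the sketch below shows the standard computation from a respondents-by-items table of scored responses. The function name and the small data set are hypothetical and serve only to illustrate the formula; they are not responses from this study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a respondents-by-items table of scored responses."""
    k = len(item_scores[0])                    # number of items
    columns = list(zip(*item_scores))          # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses from four people on a three-item scale,
# already reverse-scored where appropriate.
scores = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 4, 5],
    [5, 5, 4],
]
print(round(cronbach_alpha(scores), 2))
```

Alpha approaches 1 when respondents who score high on one item tend to score high on the others, so that the variance of the total scores is large relative to the sum of the item variances.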

Open-ended Comments. At the post-test session, students were also encouraged to write comments to five questions relating to
how well they had coped with the statistics, the amount learned, the key elements that helped their learning, whether the 
research exercises should be continued in future classes, and whether their attitude toward mathematics had changed. 
Responses were later analysed for recurrent themes. 

Procedure

The course in educational psychology consisted of a one-hour lecture and a two-hour laboratory session per week for one 
semester (14 weeks including a two-week mid-semester break). The laboratory sessions dealt primarily with basic research
design and statistics for the social sciences: central tendency, variability, correlation, mean differences, and hypothesis testing
using one-way analysis of variance. These issues were taught in the context of three studies in which students participated and
wrote independent reports: an observational study of student presence and behaviour in the main library (report 1); a
comparison of academic social responsibility in university students of different age groups (report 2); and a comparison of
intelligence scores for "larks" and "night-owls" (report 3). Each report counted 10 percent of the final grade for the course. (The
remaining 70 percent of the final grade was based on three tests covering the lecture and textbook (Educational Psychology;
McCormack & Pressley, 1997) material.)

Eight laboratory groups (of approximately 20 students each) were taught by five highly experienced tutors (three of whom were 
also the lecturers on the course). Tutors were directed to engage the students in collaborative small-group work to solve 
problems or complete sections of analysis, and to pool the outcomes with other groups; to engage students in whole-class 
discussions of major teaching points (such as the theoretical underpinnings of analysis of variance); to explore different 
problem-solving approaches; and to focus on conceptual understanding. Much of the teaching was based on the principles of 
cooperative learning (Johnson, Johnson, & Holubec, 1990; Slavin, 1994). Other details of the procedure were similar to those 
reported in Townsend et al. (1998).

The two instruments were administered at the first and last laboratory sessions of the course. Five laboratory groups were 
randomly assigned to the Retrospective group and three groups were assigned to the Pre-Post group. Students in the Pre-Post 
group (n = 27) completed the two instruments at the beginning and end of the course. Students in the Retrospective group also 
completed the self-concept and anxiety measures at the beginning and end of the course. However, after completing the 
instruments at the end of the course they were then given additional copies of the instruments and asked to complete them 
retrospectively, according to how they felt at the beginning of the course (the procedure outlined by Howard et al., 1979). A
number of students in both groups were present at the first session of the course but not present at the final session - these 
students were assigned to the Pre-test group. This group contained seven students who had officially dropped out of the course; 
the remaining students may have had several reasons for not being present at the final session (e.g. illness, deliberate 
avoidance, unofficially dropped out of the course). (A further 14 students completed instruments at the final session only. Most
of these students had enrolled late in the course. These students were excluded from the analyses.)



RESULTS 

It is possible that students in the Pre-test group (dropouts, irregular attenders, etc.) were systematically different in their
self-concept and anxiety from the remaining students. A preliminary analysis was undertaken of the initial scores on the two
instruments for the three groups. The mean scores are shown in Table 1. Separate one-way analyses of variance revealed no
significant differences between the groups in mathematics self-concept, F (2, 138) = 0.01, p = .990, or mathematics anxiety, F
(2, 140) = 0.33, p = .721. Thus, students who were present at both the first and last sessions were not different in initial
self-concept and anxiety from those who were not present at the end of the course. The remainder of this paper deals with only 
the Retrospective and Pre-Post groups. 
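
The F ratios for this preliminary analysis can be approximately recovered from the group sizes and the beginning-of-course means and standard deviations in Table 1. The Python sketch below is a check under stated assumptions, not the original analysis: it assumes that the tabled standard deviations are sample standard deviations and that every student in each group completed both instruments. The function name is introduced here for illustration.

```python
def anova_from_summary(groups):
    """One-way ANOVA F ratio from (n, mean, sd) triples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Beginning-of-course (n, mean, SD) for the Retrospective, Pre-Post and
# Pre-test groups, taken from Table 1.
self_concept = [(66, 83.35, 20.97), (27, 83.96, 19.92), (48, 83.40, 17.94)]
anxiety = [(66, 31.86, 9.83), (27, 30.93, 10.08), (48, 30.48, 7.80)]

for name, data in (("self-concept", self_concept), ("anxiety", anxiety)):
    f_ratio, df1, df2 = anova_from_summary(data)
    print(f"{name}: F({df1}, {df2}) = {f_ratio:.2f}")
```

Because the three group means differ by well under one scale point while the within-group standard deviations are large, the between-groups mean square is tiny relative to the within-groups mean square, which is why both F ratios fall far below 1.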

Scores on the mathematics self-concept scale and the mathematics anxiety scale were analysed in separate Group
(Retrospective versus Pre-Post) × Time (Beginning, End) analyses of variance, with repeated measures on the Time factor. For
mathematics self-concept there was a significant main effect for Time, with the mean score being higher at the end of the
course (M = 90.30, SD = 20.12) than at the beginning of the course (M = 83.53, SD = 20.57), F (1, 91) = 27.23, p < .001 (Pillai’s
Trace). The main effect for Group was not significant (p = .724), nor was the interaction effect (p = .484). For mathematics
anxiety there was also a significant main effect for Time, with the mean score being lower at the end of the course (M = 28.88,
SD = 9.04) than at the beginning of the course (M = 31.59, SD = 9.85), F (1, 91) = 27.23, p < .001 (Pillai’s Trace). Again, the
main effect for Group was not significant (p = .688), nor was the interaction effect (p = .876). The mean scores associated with
these analyses are shown in Table 1. 

Table 1

Means and Standard Deviations on the Mathematics Self-Concept and Anxiety Scales as a Function of Group and Time of
Administration.

                    Beginning          End                Retrospective
Group / Measure     Mean     SD        Mean     SD        Mean     SD

Retrospective
  Self-Concept      83.35    20.97     89.56    20.75     83.38    21.08
  Anxiety           31.86     9.83     29.09     8.96     32.65     9.71
Pre-Post
  Self-Concept      83.96    19.92     92.11    18.75
  Anxiety           30.93    10.08     28.37     9.37
Pre-Test
  Self-Concept      83.40    17.94
  Anxiety           30.48     7.80
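
The overall beginning and end means quoted in the text for the Time effect are the sample-size-weighted averages of the Retrospective (n = 66) and Pre-Post (n = 27) cell means in Table 1. The short Python sketch below makes that arithmetic explicit; the function and variable names are introduced here for illustration only.

```python
def pooled_mean(cells):
    """Sample-size-weighted mean of group means; cells are (n, mean) pairs."""
    total_n = sum(n for n, _ in cells)
    return sum(n * m for n, m in cells) / total_n


# (n, mean) for the Retrospective and Pre-Post groups, taken from Table 1.
table_1 = {
    ("self-concept", "beginning"): [(66, 83.35), (27, 83.96)],
    ("self-concept", "end"): [(66, 89.56), (27, 92.11)],
    ("anxiety", "beginning"): [(66, 31.86), (27, 30.93)],
    ("anxiety", "end"): [(66, 29.09), (27, 28.37)],
}

for (measure, time), cells in table_1.items():
    print(f"{measure:12s} {time:9s} M = {pooled_mean(cells):.2f}")
```

The four pooled values agree with the means reported for the 93 students in the Group by Time analyses (83.53 and 90.30 for self-concept, 31.59 and 28.88 for anxiety).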



In order to examine the differences between the estimates of self-concept and anxiety made at the beginning of the course, the 
end of the course, and retrospectively about the beginning of the course, scores for the Retrospective group were examined in 
separate one-way repeated measures analyses of variance for each measure. There were significant effects associated with 
test version (beginning, end, retrospective) for both the mathematics self-concept scale, F (2, 130) = 10.54, p < .001, and for
the mathematics anxiety scale, F (2, 130) = 12.19, p < .001. Pairwise comparisons (using the Least Significant Difference test)
revealed that the judgements made retrospectively about the beginning of the course were not significantly different from the
actual judgements made at the beginning of the course for both self-concept (p = .985) and anxiety (p = .290). The 
retrospective judgements were, however, significantly different from the judgements made at the end of the course, for both 
self-concept and anxiety (p < .001). As expected from the previous analysis, pairwise comparisons made between the beginning 
and end scores were also significant for both measures (p < .001). In brief, students showed increased self-concept and lowered 
anxiety at the end of the course, and their retrospective judgements matched the actual judgements that they had made at the 
beginning of the course. 
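
The size of the three pairwise contrasts can be read directly from the Retrospective group means in Table 1. The following Python sketch simply tabulates the mean differences; it does not reproduce the significance tests, which require the individual scores. The function name is introduced here for illustration.

```python
def mean_differences(beginning, end, retrospective):
    """Differences between the three test versions for one measure."""
    return {
        "retrospective - beginning": retrospective - beginning,
        "end - beginning": end - beginning,
        "end - retrospective": end - retrospective,
    }


# Retrospective group means from Table 1: (beginning, end, retrospective).
means = {
    "self-concept": (83.35, 89.56, 83.38),
    "anxiety": (31.86, 29.09, 32.65),
}

for measure, versions in means.items():
    print(measure)
    for contrast, diff in mean_differences(*versions).items():
        print(f"  {contrast:27s} {diff:+.2f}")
```

The retrospective and actual beginning means differ by only 0.03 points for self-concept and 0.79 points for anxiety, whereas both differ from the end-of-course means by several points, in line with the pattern of pairwise comparisons reported above.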

Finally, an analysis was made of the 93 students’ written answers to the open-ended questions. Most students felt that they had 
enjoyed the laboratory work, had coped reasonably well with the statistics, had learned something, and attributed their learning 
to a combination of the "excellent" laboratory notes supplied, the supportive tutors, and the pace of the tasks. They were also 
positive about the need to continue such research exercises for future classes. Of most interest were the comments about 
whether their attitudes to mathematics had changed during the course. Of 82 students making comments, 38 responded 
positively (e.g. "I thought I was a math illiterate, but I [now] realise I’ve never been shown how the calculations were done", "I 
started extremely aversive to math. I do feel a lot more comfortable now"). A further 35 students reported no change in their 
attitude. Of these 35 responses, 14 students were positive about their skill (e.g., "No [there has been no change], I’ve always 
felt capable", "No, I’ve always liked maths"), five students expressed a negative attitude ("No, maths bores me", "No, I think
there will always be a fear for me no matter what the change in instruction or ability"), and 16 responses could not be classified 
as either positive or negative (usually because students simply wrote "No [change]"). Two students reported a negative change in 
attitude (e.g., "It got worse. Now I realise I am really slow at maths"). Finally, seven comments could not be classified in terms of 
whether there had been a change, or the direction of the change (e.g., "I still cringe when maths is involved if I don’t understand
it, but I’m fine when I do"). Overall, the majority of students reported either becoming more positive toward mathematics during 
the course or maintaining their positive attitude. 
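
The tally behind this summary can be laid out explicitly. The Python sketch below uses only the counts reported in the preceding paragraph; the category labels are shorthand introduced here.

```python
# Counts of responses to the attitude-change question, as reported in the text.
responses = {
    "more positive": 38,
    "no change, positive": 14,
    "no change, negative": 5,
    "no change, unclassified": 16,
    "more negative": 2,
    "unclassifiable": 7,
}

total = sum(responses.values())
favourable = responses["more positive"] + responses["no change, positive"]

print(total)                        # 82
print(f"{favourable / total:.0%}")  # 63%
```

On these counts, 52 of the 82 students who commented either became more positive or reported an unchanged positive attitude.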



DISCUSSION 

This study had been designed to examine whether a previous finding of "no difference" between pre-test and post-test
scores in mathematics anxiety could be due to a response-shift bias. However, in this study the finding of "no difference" did not 
eventuate - there was a decrease in anxiety (as had been anticipated in the original study). This made no case for the 
retrospective then-test to answer. However, the fact that the pre-test and then-test scores were almost identical strongly 
supports the argument that participation in the course had not changed the internal standards associated with the pre- and 
post-test use of the self-report measures of anxiety and self-concept. This strengthens the claim that mathematics anxiety 
had decreased during the course. These results confirm the "then-now" procedure (Howard et al., 1979) as a useful technique in
studies of such interventions. 

The finding that anxiety decreased over a compressed 12-week course offers no support for the suggestions made by Townsend 
et al. (1998) that 25 weeks may be insufficient to influence affective responses associated with using mathematics, and that
the measurement instrument was insufficiently sensitive to detect changes in anxiety. Rejection of these hypotheses 
strengthens the likelihood that the earlier finding may have resulted because the administration of the anxiety instrument 
coincided with the final statistical activity in the course. It may be that while the course served to decrease general anxiety 
associated with mathematics, state anxiety associated with the final assignment influenced the scores on the anxiety scale. This 
reinforces the need to consider mathematics anxiety as both a state and a trait variable, and to distinguish these from test 
anxiety (Anton & Klisch, 1995).

It is possible that other changes associated with the course could explain the difference in results between the two studies. A
new textbook was adopted and there were some reductions in teaching personnel. The course was also restructured in response 
to the new semester length. The number of laboratory reports was reduced from four to three, and their assessment value was 
increased from 20 percent to 30 percent of the final grade. It might be argued that compression of time and increased value of 
the assessment would not lead to lower levels of anxiety, although the reduced number of reports might. Perhaps more 
significantly, the report that was dropped was the most complex assignment, the calculation and interpretation of a
two-factor analysis of variance with interaction. The conceptual issues of this procedure were still taught, but students were
not required to complete an assessment on this aspect of the course. On the one hand, this does not seem likely to have caused the reduced
anxiety. The written teaching notes and the actual teaching of the material were highly scaffolded, and utilised the same 
experienced, supportive tutors as previously. Furthermore, this section of the course had never been singled out by students for 
particular criticism in previous years. Finally, the consistent findings across both studies with regard to mathematics 
self-concept and the comments made by students do not suggest a disjunction between the conditions of the two studies. It 
seems more likely that the reduced anxiety in the current study resulted from completing the anxiety scale after completion of 
all assessment exercises; however, the influence of the dropped report cannot be ruled out. 

In summary, the negative attitudes often found in students taking courses involving mathematics and statistics can be improved 
with the use of small-group, cooperative activities within a supportive learning environment, in which the students are 
participants in the research process. The addition of evaluative research techniques such as retrospective pre-tests helps 
reinforce this conclusion. 